AITopics | open science

open science

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

FLIP Reasoning Challenge

Plesner, Andreas, Kuzhagaliyev, Turlan, Wattenhofer, Roger

arXiv.org Artificial Intelligence | Apr-17-2025

Over the past years, advances in artificial intelligence (AI) have demonstrated how AI can solve many perception and generation tasks, such as image classification and text writing, yet reasoning remains a challenge. This paper introduces the FLIP dataset, a benchmark for evaluating AI reasoning capabilities based on human verification tasks on the Idena blockchain. FLIP challenges present users with two orderings of 4 images, requiring them to identify the logically coherent one. By emphasizing sequential reasoning, visual storytelling, and common sense, FLIP provides a unique testbed for multimodal AI systems. Our experiments evaluate state-of-the-art models, leveraging both vision-language models (VLMs) and large language models (LLMs). Results reveal that even the best open-source and closed-source models achieve maximum accuracies of 75.5% and 77.9%, respectively, in zero-shot settings, compared to human performance of 95.3%. Captioning models aid reasoning models by providing text descriptions of images, yielding better results than when using the raw images directly (75.2% with captions vs. 69.6% with raw images for Gemini 1.5 Pro). Combining the predictions from 15 models in an ensemble increases the accuracy to 85.2%. These findings highlight the limitations of existing reasoning models and the need for robust multimodal benchmarks like FLIP. The full codebase and dataset will be available at https://github.com/aplesner/FLIP-Reasoning-Challenge.
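
The abstract above does not say how the predictions of the 15 models are combined. As a minimal sketch, assuming simple majority voting over the two possible answers, an ensemble could be scored as shown below. The function names, the "left"/"right" labels, and the toy predictions are hypothetical illustrations and are not taken from the FLIP codebase.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent answer among the votes.

    Ties go to the answer that was seen first, because Counter keeps
    insertion order for elements with equal counts.
    """
    return Counter(votes).most_common(1)[0][0]


def ensemble_accuracy(model_predictions, labels):
    """Accuracy of a majority-vote ensemble.

    model_predictions: one list of answers per model, aligned with labels.
    labels: the correct answer for each challenge.
    """
    correct = 0
    for i, label in enumerate(labels):
        votes = [preds[i] for preds in model_predictions]
        if majority_vote(votes) == label:
            correct += 1
    return correct / len(labels)


# Toy data (hypothetical): which of the two image orderings is coherent.
labels = ["left", "right", "left", "right"]
model_a = ["left", "right", "right", "right"]  # 3 of 4 correct
model_b = ["left", "left", "left", "right"]    # 3 of 4 correct
model_c = ["right", "right", "left", "left"]   # 2 of 4 correct

print(ensemble_accuracy([model_a, model_b, model_c], labels))
```

In this toy case each model errs on a different challenge, so the vote corrects every individual mistake. That is the general reason an ensemble can outperform its best member, as long as the members' errors are not strongly correlated.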

caption, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2504.12256

Genre: Research Report > New Finding (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Open Science and Artificial Intelligence for supporting the sustainability of the SRC Network: The espSRC case

Garrido, J., Sánchez-Expósito, S., Ruiz-Falcó, A., Ruedas, J., Mendoza, M. Á., Vázquez, V., Parra, M., Sánchez, J., Labadie, I., Darriba, L., Moldón, J., Rodriguez-Álvarez, M., Díaz, J., Verdes-Montenegro, L.

arXiv.org Artificial Intelligence | Mar-20-2025

The SKA Observatory (SKAO), a landmark project in radio astronomy, seeks to address fundamental questions in astronomy. To process its immense data output, approximately 700 PB/year, a global network of SKA Regional Centres (SRCNet) will provide the infrastructure, tools, and computational power needed for scientific analysis and scientific support. The Spanish SRC (espSRC) focuses on ensuring the sustainability of this network by reducing its environmental impact, integrating green practices into data platforms, and developing Open Science technologies to enable reproducible research. This paper discusses and summarizes part of the research and development activities that the team is conducting to reduce SRC energy consumption at the espSRC and across SRCNet. The paper also discusses fundamental research on trusted repositories to support Open Science practices.
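
To put the 700 PB/year figure in perspective, it can be converted into an average sustained data rate. The sketch below is a back-of-the-envelope calculation only: it assumes decimal petabytes (10^15 bytes), a 365-day year, and data arriving evenly over the year, none of which the abstract states.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s in a non-leap year
BYTES_PER_PB = 10**15                  # decimal petabyte (assumption)
BYTES_PER_GB = 10**9                   # decimal gigabyte


def sustained_rate_gb_per_s(petabytes_per_year):
    """Average rate in GB/s if the yearly volume were spread evenly."""
    bytes_per_second = petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR
    return bytes_per_second / BYTES_PER_GB


rate = sustained_rate_gb_per_s(700)
print(f"{rate:.1f} GB/s")  # about 22.2 GB/s
```

Under these assumptions the network would need to absorb roughly 22 GB every second on average, which helps explain why energy consumption is treated as a sustainability concern.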

artificial intelligence, big data, data mining, (16 more...)

arXiv.org Artificial Intelligence

2503.16045

Country:

Europe > Switzerland (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > Netherlands > North Holland > Haarlem (0.04)
Europe > Montenegro (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence (0.51)
Information Technology > Data Science > Data Mining > Big Data (0.34)

Typhoon T1: An Open Thai Reasoning Model

Taveekitworachai, Pittawat, Manakul, Potsawee, Tharnpipitchai, Kasima, Pipatanakul, Kunat

arXiv.org Artificial Intelligence | Feb-13-2025

This paper introduces Typhoon T1, an open effort to develop an open Thai reasoning model. A reasoning model is a relatively new type of generative model built on top of large language models (LLMs). A reasoning model generates a long chain of thought before arriving at a final answer, an approach found to improve performance on complex tasks. However, details on developing such a model are limited, especially for reasoning models that can generate traces in a low-resource language. Typhoon T1 presents an open effort that dives into the details of developing a reasoning model in a more cost-effective way by leveraging supervised fine-tuning using open datasets, instead of reinforcement learning. This paper shares the details about synthetic data generation and training, as well as our dataset and model weights. Additionally, we provide insights gained from developing a reasoning model that generalizes across domains and is capable of generating reasoning traces in a low-resource language, using Thai as an example. We hope this open effort provides a foundation for further research in this field.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.09042

Country:

North America (1.00)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AI for Open Science: A Multi-Agent Perspective for Ethically Translating Data to Knowledge

Yakaboski, Chase, Hyde, Gregory, Nyanhongo, Clement, Santos, Eugene Jr

arXiv.org Artificial Intelligence | Oct-31-2023

AI for Science (AI4Science), particularly in the form of self-driving labs, has the potential to sideline human involvement and hinder scientific discovery within the broader community. While prior research has focused on ensuring the responsible deployment of AI applications, enhancing security, and ensuring interpretability, we also propose that promoting openness in AI4Science discoveries should be carefully considered. In this paper, we introduce the concept of AI for Open Science (AI4OS) as a multi-agent extension of AI4Science with the core principle of maximizing open knowledge translation throughout the scientific enterprise rather than a single organizational unit. We use the established principles of Knowledge Discovery and Data Mining (KDD) to formalize a language around AI4OS. We then discuss three principal stages of knowledge translation embedded in AI4Science systems and detail specific points where openness can be applied to yield an AI4OS alternative. Lastly, we formulate a theoretical metric to assess AI4OS with a supporting ethical argument highlighting its importance. Our goal is that by drawing attention to AI4OS we can ensure the natural consequence of AI4Science (e.g., self-driving labs) is a benefit not only for its developers but for society as a whole.

agent, knowledge, open science, (15 more...)

arXiv.org Artificial Intelligence

2310.18852

Country:

North America > United States > New Hampshire > Grafton County > Hanover (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Austria > Styria > Graz (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices

Cao, Hancheng, Dodge, Jesse, Lo, Kyle, McFarland, Daniel A., Wang, Lucy Lu

arXiv.org Artificial Intelligence | Oct-4-2023

In recent years, funding agencies and journals have increasingly advocated for open science practices (e.g., data and method sharing) to improve the transparency, access, and reproducibility of science. However, quantifying these practices at scale has proven difficult. In this work, we leverage a large-scale dataset of 1.1M papers from arXiv that are representative of the fields of physics, math, and computer science to analyze the adoption of data and method link-sharing practices over time and their impact on article reception. To identify links to data and methods, we train a neural text classification model to automatically classify URL types based on contextual mentions in papers. We find evidence that the practice of link-sharing to methods and data is spreading as more papers include such URLs over time. Reproducibility efforts may also be spreading because the same links are being increasingly reused across papers (especially in computer science), and these links are increasingly concentrated within fewer web domains (e.g., GitHub) over time. Lastly, articles that share data and method links receive increased recognition in terms of citation count, with a stronger effect when the shared links are active (rather than defunct). Together, these findings demonstrate the increased spread and perceived value of data and method sharing practices in open science.

data and method link-sharing practice, evolution and perceived value, open science, (1 more...)

arXiv.org Artificial Intelligence

2310.03193

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence (0.53)

Investigating Reproducibility at Interspeech Conferences: A Longitudinal and Comparative Perspective

Arvan, Mohammad, Doğruöz, A. Seza, Parde, Natalie

arXiv.org Artificial Intelligence | Aug-29-2023

Reproducibility is a key aspect of scientific advancement across disciplines, and reducing barriers to open science is a focus area for the theme of Interspeech 2023. Availability of source code is one of the indicators that facilitates reproducibility. However, less is known about the rates of reproducibility at Interspeech conferences in comparison to other conferences in the field. To fill this gap, we have surveyed 27,717 papers at seven conferences across speech and language processing disciplines. We find that despite having a similar number of accepted papers to the other conferences, Interspeech has up to 40% lower source code availability. In addition to reporting the difficulties we have encountered during our research, we also provide recommendations and possible directions to increase reproducibility for further studies.

artificial intelligence, natural language, reproducibility, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2023-2252

2306.10033

Country:

North America > United States > Maine > Kennebec County > Waterville (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Why diversity and inclusion needs to be at the forefront of future AI

Robohub | Jul-1-2023, 07:27:29 GMT

Inês Hipólito is a highly accomplished researcher, recognized for her work in esteemed journals and contributions as a co-editor. She has received research awards including the prestigious Talent Grant from the University of Amsterdam in 2021. After her PhD, she held positions at the Berlin School of Mind and Brain and Humboldt-Universität zu Berlin. Currently, she is a permanent lecturer of the philosophy of AI at Macquarie University, focusing on cognitive development and the interplay between augmented cognition (AI) and the sociocultural environment. Neurourbanism as a Novel Approach in Global Health,' funded by the Berlin University Alliance.

ai technology, cognition, diversity and inclusion, (15 more...)

Robohub

Country: Europe > Netherlands > North Holland > Amsterdam (0.25)

Genre: Personal (0.69)

Technology: Information Technology > Artificial Intelligence > Cognitive Science (0.50)

EleutherAI: Going Beyond "Open Science" to "Science in the Open"

Phang, Jason, Bradley, Herbie, Gao, Leo, Castricato, Louis, Biderman, Stella

arXiv.org Artificial Intelligence | Oct-12-2022

Over the past two years, EleutherAI has established itself as a radically novel initiative aimed at both promoting open-source research and conducting research in a transparent, openly accessible and collaborative manner. EleutherAI's approach to research goes beyond transparency: by doing research entirely in public, anyone in the world can observe and contribute at every stage. Our work has been received positively and has resulted in several high-impact projects in Natural Language Processing and other fields. In this paper, we describe our experience doing public-facing machine learning research, the benefits we believe this approach brings, and the pitfalls we have encountered.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2210.06413

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)

naab: A ready-to-use plug-and-play corpus for Farsi

Sabouri, Sadra, Rahmati, Elnaz, Gooran, Soroush, Sameti, Hossein

arXiv.org Artificial Intelligence | Aug-29-2022

Huge corpora of textual data are known to be crucial for training deep models such as transformer-based ones. This need is more pressing in lower-resource languages such as Farsi. We propose naab, the biggest cleaned and ready-to-use open-source textual corpus in Farsi. It contains about 130GB of data, 250 million paragraphs, and 15 billion words. The project name is derived from the Farsi word ناب (naab), which means pure and high-grade. We also provide the raw version of the corpus, called naab-raw, and an easy-to-use preprocessor that can be employed by those who want to make a customized corpus.

corpora, corpus, farsi, (14 more...)

arXiv.org Artificial Intelligence

2208.13486

Country:

Europe > Germany > Saxony > Leipzig (0.05)
Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
North America > Dominican Republic (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

BLOOM Is the Most Important AI Model of the Decade

#artificialintelligence | Jul-7-2022, 04:05:14 GMT

You may be wondering if such a bold headline is true. GPT-3 came out in 2020 and established a new road the whole AI industry has been following in intention and attention since. Tech companies have repeatedly built better, larger models, one after another. But although they've put millions into the task, none of them has fundamentally changed the leading paradigm or the game's rules GPT-3 laid out two years ago. Gopher, Chinchilla, and PaLM (arguably the current podium of large language models) are significantly better than GPT-3 but they are, in essence, more of the same thing.

bigscience and bloom, bloom, transformer-based model, (13 more...)

#artificialintelligence

Country: North America > Canada > Quebec > Montreal (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.31)
